-
Notifications
You must be signed in to change notification settings - Fork 7.6k
Improve TokenUtils performance through caching #9964
Conversation
|
||
while (TokenUtils.moveNextToken(ctx, false)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marcelgerber Thanks for digging into this one. That's an impressive improvement.
I'm wondering if the main slow down is in TokenUtils.moveNextToken()
because it has to get the full line text for every token. I wonder if the line text can be cached in the "context" object to improve performance? Brackets seems to choke on files with long lines in many places so that would provide a benefit for many other functions and not just this one. Care to give that a try?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@redmunds is not getting the line, is mainly ctx.editor.getTokenAt()
which has to parse the entire line up to the given position every time, and essentially parsing the same content many times.
What we could do is, run cm.getLineTokens(line)
, save the tokens in the ctx and make moveNextToken
use those tokens until it gets to a new line, where it will call cm.getLineTokens(line)
, save it and return the first one.
BTW, why are the TokenUtils function that require a ctx and not a class that can be created one with methods that will not require the ctx?
3a1e205
to
7869026
Compare
@redmunds @TomMalbran I managed to come up with a caching solution in TokenUtils, which seems to work out quite well. |
@marcelgerber This is a great start! I verified that it fixes #9717, but it will need lots more testing. I think the (1 sec) timeout should be replaced by listening for the Document "change" event. If there's a change to the line that's cached, then clear it. Also, there can be multiple contexts at one time, so it might be worth having a cache for each context object. This is usually used to looking ahead to next (or back at previous) token when parsing without affecting current context, so it will usually use the same cached line, but this is something to keep in mind as a future improvement. |
return editor.getTokenAt(pos, precise); | ||
} | ||
var cachedTokens = _manageCache(editor, pos.line), | ||
token = _.find(cachedTokens, function (token) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_.find()
iterates over list which is a sequential search. For really long lines of text (e.g. minified files), it's probably worth implementing a binary search here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another idea is to keep track of last token, and pass in a first/last/prev/next/random flag to provide a hint where to start search. But, this may only work if each context has it's own cache.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Binary search sounds promising. I'll try that once I got the time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it might be worth it for the future to have a Token class, with move methods and an internal cache, since it looks like after having a context we want to move it searching for something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually found out that Lodash has a binary search function (_.sortedIndex
- is there anything that's not in Lodash?!) and well, I used it :)
The |
Done with code review. |
Ugh, didn't notice there's a JSHint error. Ah, and the problem with |
Search both language/CSSUtils.js and language/HTMLUtils.js and for
As I said, this just needs to be kept in mind -- we might not need this. Seems like the worst performance is in minified files where everything is in (mostly) 1 line, so a single cache should work. |
Just switched About doc.change events: |
That's only true for
The problem with UPDATE: it looks like you can pass a function to |
@@ -158,14 +204,10 @@ define(function (require, exports, module) { | |||
* @return {mode:{Object}, name:string} | |||
*/ | |||
function getModeAt(cm, pos) { | |||
var outerMode = cm.getMode(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function was written this way for nested mode support, so we need to make sure that cm.getModeAt(pos)
supports that. I'm seeing the HTMLUtils InlineEditorProviders "should find style blocks" unit test fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't tested it, but the CM manual said it returns the mode at the given pos.
I'll look into the unit tests and the Travis failure soon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For me, it works.
I checked it in a file with mixed mode (HTML/JS) and the unit tests pass for me as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only brought up TokenUtils.getModeAt()
because I thought that it may benefit from the cache, but that does not appear to be the case. To keep these improvements isolated so they can be measured separately, this change should be moved to another pull request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reverted the change.
It's actually not a Brackets Editor object, but these are actually only CodeMirror instances - so I can't use |
01c1925
to
2d920f5
Compare
You can listen to the CodeMirror instance "changes" event: cm.on("changes", function (instance, changeList) {
}); |
898fe35
to
23774c2
Compare
I have implemented the (still somewhat experimental) CM changes handler. |
@marcelgerber This looks good. I found another scenario where this helps and added a test case in description. Squash commits so I can merge this. |
18af3ec
to
03bd448
Compare
@redmunds Ready for merge. |
Travis CI is failing. It does not appear to be related to this code, but I want to make sure this passes before merging. |
03bd448
to
792467c
Compare
Merging. |
Improve TokenUtils performance through caching
For #9717.
Testing
Test 1
Using this file: https://gist.github.com/MarcelGerber/5c11f1031af292044708, place your cursor in the
<script>
tag and pressCtrl-Space
.Test 2
Using same file, place your cursor in the
<body>
tag and pressCmd/Ctrl-E
.